Visualisation and Intercomparison of GNSS In-Situ ZTD and IWV Time Series¶

Hugo Breton1 (breton@ipgp.fr)
Olivier Bock1 (bock@ipgp.fr)
1Institut de Physique du Globe de Paris (IPGP)

Notebook developed in April 2025


In this notebook, we show how to download and concatenate the files served by the CDS for the insitu-observations-gnss dataset.
The notebook visualises and intercompares time series from the IGS-daily and IGS-repro3 GNSS datasets available on the CDS, in both ZTD and IWV units, and compares GNSS IWV with ERA5 IWV estimates.
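ZTD and IWV are linked by the standard wet-refractivity conversion: the hydrostatic part of the delay is removed, and the remaining zenith wet delay is scaled by a factor that depends on the weighted-mean temperature Tm. The sketch below is illustrative only, not the CDS processing chain: the constants are commonly used values, and in practice Tm, surface pressure and station coordinates come from auxiliary data such as ERA5.

```python
import math

def ztd_to_iwv(ztd_m, pressure_hpa, lat_deg, height_m, tm_k):
    """Convert zenith total delay (m) to IWV (kg m^-2).

    Illustrative sketch: Saastamoinen/Davis hydrostatic delay plus the
    standard wet-refractivity conversion factor.
    """
    # Hydrostatic (dry) part of the delay, Saastamoinen/Davis model
    zhd = (0.0022768 * pressure_hpa
           / (1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg))
              - 0.28e-6 * height_m))
    zwd = ztd_m - zhd  # zenith wet delay (m)

    # Conversion factor: IWV = ZWD / (1e-6 * (k2' + k3/Tm) * Rv)
    rv = 461.5        # J kg^-1 K^-1, specific gas constant of water vapour
    k2_prime = 0.221  # K Pa^-1   (22.1 K hPa^-1)
    k3 = 3.739e3      # K^2 Pa^-1 (3.739e5 K^2 hPa^-1)
    kappa = 1e-6 * (k2_prime + k3 / tm_k) * rv  # m^3 kg^-1
    return zwd / kappa
```

With typical mid-latitude values (ZTD ≈ 2.45 m, P ≈ 1013 hPa, Tm ≈ 275 K) this yields an IWV of roughly 20–25 kg m⁻².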

We use the following Python libraries:

  • xarray
  • numpy
  • pandas
  • seaborn
  • matplotlib

Note: You need the cads-api-client to download data. Please ensure you have accepted the dataset terms and saved your access credentials in ~/.cads-api-client.json.
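For reference, the credentials file is a small JSON document. A minimal sketch of its expected content follows; the exact schema is an assumption based on the client's documentation, and the URL and key are placeholders to be replaced with the values from your CDS profile page.

```python
import json
from pathlib import Path

# Placeholder credentials -- replace with the API URL and the personal
# access token shown on your CDS profile page.
credentials = {
    "url": "https://cds.climate.copernicus.eu/api",
    "key": "<your-personal-access-token>",
}

config_path = Path.home() / ".cads-api-client.json"
# Uncomment to actually write the file:
# config_path.write_text(json.dumps(credentials))
print(json.dumps(credentials, indent=2))
```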

Functions used in this notebook are maintained on GitHub:
🔗 https://github.com/ipgpbreton/GNSStoolbox.git

In [42]:
%matplotlib inline
import xarray as xr
import pandas as pd
import numpy as np
from cads_api_client import ApiClient
import os
from pathlib import Path
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns
import Visualization_functions as vf
import importlib
importlib.reload(vf)
import psutil
In [43]:
process = psutil.Process()
print(f"Memory usage: {process.memory_info().rss / 1024 ** 2:.2f} MB")
Memory usage: 50702.56 MB

C3S API Request¶

The GNSS data used in this analysis is available from the Copernicus Climate Data Store (CDS). Access the dataset through the CDS GNSS Observations Portal.

In [3]:
variable = "total_column_water_vapour_era5"  # zenith_total_delay or total_column_water_vapour or total_column_water_vapour_era5
starting_year, ending_year = 1996, 2014
network_type = "epn_repro2"  # igs_repro3 or igs_daily or epn_repro2
In [4]:
custom_base = input(
    "Enter base directory to save files (leave empty to use current folder): "
).strip()
base_dir = Path(custom_base) if custom_base else Path.cwd()
base_dir = base_dir / "Datas" / "EGU2025"  # /home/hbreton/Documents
In [298]:
client = ApiClient()
dataset = "insitu-observations-gnss"
[2024-06-16T00:00:00] CDS API syntax is changed and some keys or parameter names may have also changed. To avoid requests failing, please use the "Show API request code" tool on the dataset Download Form to check you are using the correct syntax for your API request.
In [299]:
for year in range(starting_year, ending_year + 1):
    year_str = str(year)
    request = {
        "network_type": network_type,
        "variable": [variable],
        "year": year_str,
        "month": [f"{m:02d}" for m in range(1, 13)],
        "day": [f"{d:02d}" for d in range(1, 32)],
        "format": "netcdf",
    }
    file_path = f"{base_dir}/{year_str}/{variable}_{network_type}_{year_str}.nc"

    if os.path.exists(file_path):
        print(f"File already exists: {file_path}. Skipping download.")
    else:
        try:
            os.makedirs(os.path.dirname(file_path), exist_ok=True)
            client.retrieve(dataset, **request, target=file_path)
            print(f"Download successful for {year_str}")
        except Exception as e:
            print(f"Error during download for {year_str}: {e}")
f38f961ce993608eebbcffa781f653c0.nc:   0%|          | 0.00/2.06M [00:00<?, ?B/s]
Download successful for 1996
4741e109ad11ab7e32d1c2d4fd3c3289.nc:   0%|          | 0.00/3.04M [00:00<?, ?B/s]
Download successful for 1997
740528945ccde586465bf283063e90a6.nc:   0%|          | 0.00/3.99M [00:00<?, ?B/s]
Download successful for 1998
e65ecb75c5f96ce2bad2f939b63b4ad3.nc:   0%|          | 0.00/4.51M [00:00<?, ?B/s]
Download successful for 1999
c6c987b6f52aa01a6157341f44549bff.nc:   0%|          | 0.00/5.45M [00:00<?, ?B/s]
Download successful for 2000
176a4c9a848ce9e58e47b40fae027626.nc:   0%|          | 0.00/6.55M [00:00<?, ?B/s]
Download successful for 2001
3bd87a5fce8821254fafdbd0a0a25a9b.nc:   0%|          | 0.00/7.30M [00:00<?, ?B/s]
Download successful for 2002
6a40a3791f9c53ca45392ba9261783aa.nc:   0%|          | 0.00/8.06M [00:00<?, ?B/s]
Download successful for 2003
22af69372b786bc3ee024e22d3ff51a3.nc:   0%|          | 0.00/8.66M [00:00<?, ?B/s]
Download successful for 2004
d6b0ce6ac2ae9d8272190a4b506feb02.nc:   0%|          | 0.00/9.95M [00:00<?, ?B/s]
Download successful for 2005
40c915d4f311bfbe1ac555c884a1a89d.nc:   0%|          | 0.00/10.8M [00:00<?, ?B/s]
Download successful for 2006
5bf6f381da18bd2a5633ed2c66ffd19e.nc:   0%|          | 0.00/10.9M [00:00<?, ?B/s]
Download successful for 2007
4904e0e0b6f11c725dbe325a5e01f2b3.nc:   0%|          | 0.00/12.1M [00:00<?, ?B/s]
Download successful for 2008
2cd7284df6d7a8d58c3a9291ff5db60a.nc:   0%|          | 0.00/13.0M [00:00<?, ?B/s]
Download successful for 2009
dbc3d50f9c5df1b5cf449abd643d7369.nc:   0%|          | 0.00/13.6M [00:00<?, ?B/s]
Download successful for 2010
3d1f23ccd116e64fcd5faebc10f82eed.nc:   0%|          | 0.00/13.8M [00:00<?, ?B/s]
Download successful for 2011
5b6142698d930a63340cc02bc278bb08.nc:   0%|          | 0.00/14.0M [00:00<?, ?B/s]
Download successful for 2012
bbf281593a3c2fd467534e2285089fca.nc:   0%|          | 0.00/13.7M [00:00<?, ?B/s]
Download successful for 2013
a55ea7b6005bad3fdccdd70a1051894.nc:   0%|          | 0.00/13.5M [00:00<?, ?B/s]
Download successful for 2014

Concatenation of the yearly files¶

In [5]:
datasets = []
for year in range(starting_year, ending_year + 1):
    file_path = f"{base_dir}/{year}/{variable}_{network_type}_{year}.nc"
    if os.path.exists(file_path):
        cadsobs_dataset = xr.open_dataset(file_path, engine="netcdf4")
        cadsobs_dataset[
            "observed_variable"
        ] = cadsobs_dataset.observed_variable.str.decode("UTF-8")
        df_cadsobs_dataset = (
            cadsobs_dataset.to_dataframe()
            .set_index(["primary_station_id", "observed_variable"], inplace=False)
            .sort_index(level="primary_station_id")
        )
        datasets.append(df_cadsobs_dataset)
        print(f"Dataset loaded for {year}: {file_path}")
    else:
        print(f"File not found: {file_path}")
Dataset loaded for 1996: /home/hbreton/Documents/Datas/EGU2025/1996/total_column_water_vapour_era5_epn_repro2_1996.nc
Dataset loaded for 1997: /home/hbreton/Documents/Datas/EGU2025/1997/total_column_water_vapour_era5_epn_repro2_1997.nc
Dataset loaded for 1998: /home/hbreton/Documents/Datas/EGU2025/1998/total_column_water_vapour_era5_epn_repro2_1998.nc
Dataset loaded for 1999: /home/hbreton/Documents/Datas/EGU2025/1999/total_column_water_vapour_era5_epn_repro2_1999.nc
Dataset loaded for 2000: /home/hbreton/Documents/Datas/EGU2025/2000/total_column_water_vapour_era5_epn_repro2_2000.nc
Dataset loaded for 2001: /home/hbreton/Documents/Datas/EGU2025/2001/total_column_water_vapour_era5_epn_repro2_2001.nc
Dataset loaded for 2002: /home/hbreton/Documents/Datas/EGU2025/2002/total_column_water_vapour_era5_epn_repro2_2002.nc
Dataset loaded for 2003: /home/hbreton/Documents/Datas/EGU2025/2003/total_column_water_vapour_era5_epn_repro2_2003.nc
Dataset loaded for 2004: /home/hbreton/Documents/Datas/EGU2025/2004/total_column_water_vapour_era5_epn_repro2_2004.nc
Dataset loaded for 2005: /home/hbreton/Documents/Datas/EGU2025/2005/total_column_water_vapour_era5_epn_repro2_2005.nc
Dataset loaded for 2006: /home/hbreton/Documents/Datas/EGU2025/2006/total_column_water_vapour_era5_epn_repro2_2006.nc
Dataset loaded for 2007: /home/hbreton/Documents/Datas/EGU2025/2007/total_column_water_vapour_era5_epn_repro2_2007.nc
Dataset loaded for 2008: /home/hbreton/Documents/Datas/EGU2025/2008/total_column_water_vapour_era5_epn_repro2_2008.nc
Dataset loaded for 2009: /home/hbreton/Documents/Datas/EGU2025/2009/total_column_water_vapour_era5_epn_repro2_2009.nc
Dataset loaded for 2010: /home/hbreton/Documents/Datas/EGU2025/2010/total_column_water_vapour_era5_epn_repro2_2010.nc
Dataset loaded for 2011: /home/hbreton/Documents/Datas/EGU2025/2011/total_column_water_vapour_era5_epn_repro2_2011.nc
Dataset loaded for 2012: /home/hbreton/Documents/Datas/EGU2025/2012/total_column_water_vapour_era5_epn_repro2_2012.nc
Dataset loaded for 2013: /home/hbreton/Documents/Datas/EGU2025/2013/total_column_water_vapour_era5_epn_repro2_2013.nc
Dataset loaded for 2014: /home/hbreton/Documents/Datas/EGU2025/2014/total_column_water_vapour_era5_epn_repro2_2014.nc
In [6]:
df = pd.concat(datasets, axis=0)
output_dir = base_dir / "full_time_series_data"
output_dir.mkdir(parents=True, exist_ok=True)
df.to_parquet(
    f"{base_dir}/full_time_series_data/{variable}_{network_type}_{starting_year}_{ending_year}.parquet"
)
In [7]:
del datasets


Dataframes¶

In [4]:
custom_base = input(
    "Enter base directory where you saved your files (leave empty to use current folder): "
).strip()
base_dir = Path(custom_base) if custom_base else Path.cwd()
base_dir = base_dir / "Datas" / "EGU2025"  # /home/hbreton/Documents

IGS REPRO3¶

In [5]:
starting_year, ending_year = 2000, 2019 #adapt to your data

df_ZTD_igs_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/zenith_total_delay_igs_repro3_{starting_year}_{ending_year}.parquet"
)
df_ZTD_igs_repro3 = df_ZTD_igs_repro3.reset_index()
df_ZTD_igs_repro3["primary_station_id"] = df_ZTD_igs_repro3[
    "primary_station_id"
].astype(str)
df_ZTD_igs_repro3 = df_ZTD_igs_repro3.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [6]:
starting_year, ending_year = 2000, 2019 #adapt to your data

df_PWC_igs_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/total_column_water_vapour_igs_repro3_{starting_year}_{ending_year}.parquet"
)
df_PWC_igs_repro3 = df_PWC_igs_repro3.reset_index()
df_PWC_igs_repro3["primary_station_id"] = df_PWC_igs_repro3[
    "primary_station_id"
].astype(str)
df_PWC_igs_repro3 = df_PWC_igs_repro3.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [7]:
starting_year, ending_year = 2000, 2019 #adapt to your data

df_PWC_era5_igs_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/total_column_water_vapour_era5_igs_repro3_{starting_year}_{ending_year}.parquet"
)
df_PWC_era5_igs_repro3 = df_PWC_era5_igs_repro3.reset_index()
df_PWC_era5_igs_repro3["primary_station_id"] = df_PWC_era5_igs_repro3[
    "primary_station_id"
].astype(str)
df_PWC_era5_igs_repro3 = df_PWC_era5_igs_repro3.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [128]:
df_PWC_igs_repro3
Out[128]:
observation_id report_id observation_value units height_of_station_above_sea_level report_timestamp city uncertainty_type1 uncertainty_type5 uncertainty_units1 uncertainty_units5 uncertainty_value1 uncertainty_value5 longitude|station_configuration latitude|station_configuration
primary_station_id observed_variable
ABMF00GLP precipitable_water_column b'16782553' b'1' 53.900002 b'kg m-2' 15.252470 2008-07-16 00:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.72 -61.527538 16.262304
precipitable_water_column b'16782554' b'1' 47.369999 b'kg m-2' 15.252470 2008-07-16 01:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.71 -61.527538 16.262304
precipitable_water_column b'16782555' b'1' 49.630001 b'kg m-2' 15.252470 2008-07-16 02:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.68 -61.527538 16.262304
precipitable_water_column b'16782556' b'1' 48.529999 b'kg m-2' 15.252470 2008-07-16 03:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.56 -61.527538 16.262304
precipitable_water_column b'16782557' b'1' 46.389999 b'kg m-2' 15.252470 2008-07-16 04:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.54 -61.527538 16.262304
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
ZWEN00RUS precipitable_water_column b'8003708' b'422' 8.020000 b'kg m-2' 189.572144 2004-11-06 19:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.95 36.758629 55.699284
precipitable_water_column b'8003709' b'422' 7.920000 b'kg m-2' 189.572144 2004-11-06 20:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.98 36.758629 55.699284
precipitable_water_column b'8003710' b'422' 9.880000 b'kg m-2' 189.572144 2004-11-06 21:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.97 36.758629 55.699284
precipitable_water_column b'8003711' b'422' 8.490000 b'kg m-2' 189.572144 2004-11-06 22:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.82 36.758629 55.699284
precipitable_water_column b'8003712' b'422' 11.590000 b'kg m-2' 189.572144 2004-11-06 23:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.66 36.758629 55.699284

45666183 rows × 15 columns

IGS DAILY¶

In [9]:
starting_year, ending_year = 2000, 2023

df_PWC_igs_daily = pd.read_parquet(
    f"{base_dir}/full_time_series_data/total_column_water_vapour_igs_daily_{starting_year}_{ending_year}.parquet"
)
df_PWC_igs_daily = df_PWC_igs_daily.reset_index()
df_PWC_igs_daily["primary_station_id"] = df_PWC_igs_daily["primary_station_id"].astype(
    str
)
df_PWC_igs_daily = df_PWC_igs_daily.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [ ]:
starting_year, ending_year = 2000, 2023

df_ZTD_igs_daily = pd.read_parquet(
    f"{base_dir}/full_time_series_data/zenith_total_delay_igs_daily_{starting_year}_{ending_year}.parquet"
)
df_ZTD_igs_daily = df_ZTD_igs_daily.reset_index()
df_ZTD_igs_daily["primary_station_id"] = df_ZTD_igs_daily["primary_station_id"].astype(
    str
)
df_ZTD_igs_daily = df_ZTD_igs_daily.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [ ]:
starting_year, ending_year = 2000, 2023

df_PWC_era5_igs_daily = pd.read_parquet(
    f"{base_dir}/full_time_series_data/total_column_water_vapour_era5_igs_daily_{starting_year}_{ending_year}.parquet"
)
df_PWC_era5_igs_daily = df_PWC_era5_igs_daily.reset_index()
df_PWC_era5_igs_daily["primary_station_id"] = df_PWC_era5_igs_daily[
    "primary_station_id"
].astype(str)
df_PWC_era5_igs_daily = df_PWC_era5_igs_daily.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [ ]:
df_PWC_igs_daily
Out[ ]:
observation_id report_id observation_value units height_of_station_above_sea_level report_timestamp city uncertainty_type1 uncertainty_type5 uncertainty_units1 uncertainty_units5 uncertainty_value1 uncertainty_value5 longitude|station_configuration latitude|station_configuration
primary_station_id observed_variable
ABMF00GLP precipitable_water_column b'21761809' b'453' 42.410000 b'kg m-2' 15.252470 2010-01-01 00:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.55 -61.527538 16.262304
precipitable_water_column b'21761810' b'453' 43.099998 b'kg m-2' 15.252470 2010-01-01 01:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.49 -61.527538 16.262304
precipitable_water_column b'21761811' b'453' 42.299999 b'kg m-2' 15.252470 2010-01-01 02:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.48 -61.527538 16.262304
precipitable_water_column b'21761812' b'453' 42.889999 b'kg m-2' 15.252470 2010-01-01 03:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.49 -61.527538 16.262304
precipitable_water_column b'21761813' b'453' 42.290001 b'kg m-2' 15.252470 2010-01-01 04:00:00 b'Les Abymes' 1 5 b'NA' b'kg m-2' NaN 0.48 -61.527538 16.262304
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
ZWEN00RUS precipitable_water_column b'8312468' b'184' 15.370000 b'kg m-2' 189.572144 2004-11-03 19:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.38 36.758629 55.699284
precipitable_water_column b'8312469' b'184' 15.800000 b'kg m-2' 189.572144 2004-11-03 20:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.41 36.758629 55.699284
precipitable_water_column b'8312470' b'184' 14.810000 b'kg m-2' 189.572144 2004-11-03 21:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.37 36.758629 55.699284
precipitable_water_column b'8312471' b'184' 14.960000 b'kg m-2' 189.572144 2004-11-03 22:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.40 36.758629 55.699284
precipitable_water_column b'8312472' b'184' 12.920000 b'kg m-2' 189.572144 2004-11-03 23:00:00 b'Zwenigorod' 1 5 b'NA' b'kg m-2' NaN 0.38 36.758629 55.699284

65283864 rows × 15 columns

EPN REPRO2¶

In [8]:
starting_year, ending_year = 1996, 2014 #adapt to your data

df_ZTD_epn_repro2 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/zenith_total_delay_epn_repro2_{starting_year}_{ending_year}.parquet"
)
df_ZTD_epn_repro2 = df_ZTD_epn_repro2.reset_index()
df_ZTD_epn_repro2["primary_station_id"] = df_ZTD_epn_repro2[
    "primary_station_id"
].astype(str)
df_ZTD_epn_repro2 = df_ZTD_epn_repro2.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [9]:
starting_year, ending_year = 1996, 2014 #adapt to your data

df_PWC_epn_repro2 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/total_column_water_vapour_epn_repro2_{starting_year}_{ending_year}.parquet"
)
df_PWC_epn_repro2 = df_PWC_epn_repro2.reset_index()
df_PWC_epn_repro2["primary_station_id"] = df_PWC_epn_repro2[
    "primary_station_id"
].astype(str)
df_PWC_epn_repro2 = df_PWC_epn_repro2.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [10]:
starting_year, ending_year = 1996, 2014 #adapt to your data

df_PWC_era5_epn_repro2 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/total_column_water_vapour_era5_epn_repro2_{starting_year}_{ending_year}.parquet"
)
df_PWC_era5_epn_repro2 = df_PWC_era5_epn_repro2.reset_index()
df_PWC_era5_epn_repro2["primary_station_id"] = df_PWC_era5_epn_repro2[
    "primary_station_id"
].astype(str)
df_PWC_era5_epn_repro2 = df_PWC_era5_epn_repro2.set_index(
    ["primary_station_id", "observed_variable"], inplace=False
).sort_index(level="primary_station_id")
In [16]:
df_PWC_epn_repro2
Out[16]:
observation_id report_id observation_value units height_of_station_above_sea_level report_timestamp city uncertainty_type1 uncertainty_type5 uncertainty_units1 uncertainty_units5 uncertainty_value1 uncertainty_value5 longitude latitude
primary_station_id observed_variable
ACOR00ESP precipitable_water_column b'1792297' b'105' 31.070000 b'kg m-2' 11.931510 1999-08-25 00:30:00 b'A Coruna' 1 5 b'NA' b'kg m-2' NaN 1.00 -8.398936 43.364380
precipitable_water_column b'1792298' b'105' 30.120001 b'kg m-2' 11.931510 1999-08-25 01:30:00 b'A Coruna' 1 5 b'NA' b'kg m-2' NaN 1.03 -8.398936 43.364380
precipitable_water_column b'1792299' b'105' 29.719999 b'kg m-2' 11.931510 1999-08-25 02:30:00 b'A Coruna' 1 5 b'NA' b'kg m-2' NaN 1.09 -8.398936 43.364380
precipitable_water_column b'1792300' b'105' 30.360001 b'kg m-2' 11.931510 1999-08-25 03:30:00 b'A Coruna' 1 5 b'NA' b'kg m-2' NaN 0.91 -8.398936 43.364380
precipitable_water_column b'1792301' b'105' 30.500000 b'kg m-2' 11.931510 1999-08-25 04:30:00 b'A Coruna' 1 5 b'NA' b'kg m-2' NaN 0.94 -8.398936 43.364380
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
ZYWI00POL precipitable_water_column b'27922292' b'226' 11.900000 b'kg m-2' 370.923523 2014-12-23 19:30:00 b'Zywiec' 1 5 b'NA' b'kg m-2' NaN 0.95 19.205954 49.686691
precipitable_water_column b'27922293' b'226' 12.480000 b'kg m-2' 370.923523 2014-12-23 20:30:00 b'Zywiec' 1 5 b'NA' b'kg m-2' NaN 1.11 19.205954 49.686691
precipitable_water_column b'27922294' b'226' 12.310000 b'kg m-2' 370.923523 2014-12-23 21:30:00 b'Zywiec' 1 5 b'NA' b'kg m-2' NaN 0.95 19.205954 49.686691
precipitable_water_column b'27922295' b'226' 12.410000 b'kg m-2' 370.923523 2014-12-23 22:30:00 b'Zywiec' 1 5 b'NA' b'kg m-2' NaN 1.10 19.205954 49.686691
precipitable_water_column b'27922296' b'226' 13.170000 b'kg m-2' 370.923523 2014-12-23 23:30:00 b'Zywiec' 1 5 b'NA' b'kg m-2' NaN 1.13 19.205954 49.686691

26325072 rows × 15 columns

Overview¶

In [21]:
repro3_stations = df_PWC_igs_repro3.index.get_level_values(
    "primary_station_id"
).nunique()

daily_stations = df_PWC_igs_daily.index.get_level_values(
    "primary_station_id"
).nunique()

repro2_stations = df_PWC_epn_repro2.index.get_level_values(
    "primary_station_id"
).nunique()

repro3_station_list = df_PWC_igs_repro3.index.get_level_values(
    "primary_station_id"
).unique()

daily_station_list = df_PWC_igs_daily.index.get_level_values(
    "primary_station_id"
).unique()

repro2_station_list = df_PWC_epn_repro2.index.get_level_values(
    "primary_station_id"
).unique()


common_stations_repro3_daily = set(repro3_station_list).intersection(set(daily_station_list))
common_stations_repro3_repro2 = set(repro3_station_list).intersection(set(repro2_station_list))
common_stations_daily_repro2 = set(daily_station_list).intersection(set(repro2_station_list))

print(f"Number of unique stations in igs_daily: {daily_stations}")
print(f"Number of unique stations in igs_repro3: {repro3_stations}")
print(f"Number of unique stations in epn_repro2: {repro2_stations} \n")

print(f"Number of common stations between igs_repro3 and igs_daily: {len(common_stations_repro3_daily)}")
print(f"Number of common stations between igs_daily and epn_repro2: {len(common_stations_daily_repro2)}")
print(f"Number of common stations between igs_repro3 and epn_repro2: {len(common_stations_repro3_repro2)}")
Number of unique stations in igs_daily: 682
Number of unique stations in igs_repro3: 422
Number of unique stations in epn_repro2: 305 

Number of common stations between igs_repro3 and igs_daily: 422
Number of common stations between igs_daily and epn_repro2: 105
Number of common stations between igs_repro3 and epn_repro2: 84

First Time Series¶
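The plotting helpers belong to the Visualization_functions module from the GitHub toolbox linked above and are not reproduced here. As an indication of what the calls below do, a minimal stand-in with the same call signature could look like this (column names are taken from the dataframes shown earlier; the actual vf.plot_time_series may differ):

```python
import matplotlib.pyplot as plt

def plot_time_series(df, station, variable, label):
    """Minimal stand-in: plot observation_value against report_timestamp
    for one (station, variable) slice of the multi-indexed dataframe."""
    series = df.loc[(station, variable)].sort_values("report_timestamp")
    fig, ax = plt.subplots(figsize=(10, 3))
    ax.plot(series["report_timestamp"], series["observation_value"],
            lw=0.5, label=f"{station} ({label})")
    ax.set_xlabel("Date")
    ax.set_ylabel(variable)
    ax.legend()
    return fig
```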

In [56]:
vf.plot_time_series(df_ZTD_igs_repro3, "MATE00ITA", "zenith_total_delay", "IGSrepro3")
vf.plot_time_series(df_ZTD_epn_repro2, "MATE00ITA", "zenith_total_delay", "EPNrepro2")
vf.plot_time_series(df_PWC_igs_daily, "FORT00BRA", "precipitable_water_column", "IGSdaily")
vf.plot_time_series(df_PWC_era5_igs_repro3, "MATE00ITA", "precipitable_water_column_era5", "IGSrepro3")
[Four time-series plots: ZTD at MATE00ITA (IGSrepro3), ZTD at MATE00ITA (EPNrepro2), IWV at FORT00BRA (IGSdaily), and ERA5 IWV at MATE00ITA (IGSrepro3).]

Metadata from IGS Site Log Files¶

Equipment Changes for IGS Stations¶

Equipment changes for the IGS stations are available in the form of site logs at the following link:
https://files.igs.org/pub/station/log/

In this study, we use a compact form of this information, which includes:

  • The date of installation,
  • Dates of subsequent changes,
  • And the type of change.

Here we focus only on two types of changes, encoded as follows:

  • R: Receiver
  • A: Antenna

Data about the following can also be found there (see the link above):

  • Radome
  • GNSS system
  • Receiver serial number
  • Cutoff angle
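The compact change list can be filtered down to the change types of interest before overlaying it on a time series. A small sketch on toy data, using the same column layout as the df_equipment table shown below:

```python
import pandas as pd

# Toy change list in the same layout as the compact IGS file
changes = pd.DataFrame({
    "Station": ["abmf", "abmf", "zwen"],
    "Change_Date": ["2008-07-15", "2009-10-15", "2004-09-22"],
    "Type": ["RA", "R", "A"],
})
changes["Change_Date"] = pd.to_datetime(changes["Change_Date"])

# Keep only events that involve an antenna change ('A' or 'RA')
antenna_changes = changes[changes["Type"].str.contains("A")]
print(antenna_changes)
```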
In [ ]:
try:
    df_equipment = pd.read_csv(
        "/home/hbreton/Documents/Datas/IGS/IGS.eq_change_list.RA.debfin.dates.csv" # Change this path to your file
    )
    df_equipment["Change_Date"] = pd.to_datetime(
        df_equipment["Change_Date"]
    ).dt.strftime("%Y-%m-%d %H:%M:%S")

except FileNotFoundError:
    # Create an empty dataframe with the same structure if file doesn't exist
    df_equipment = pd.DataFrame(
        {"Station": [], "Year": [], "Doy": [], "Change_Date": [], "Type": []}
    )
    print("Equipment file not found, creation of an empty dataframe")
In [ ]:
df_equipment  # ensure your file has the right format
Out[ ]:
Station Year Doy Change_Date Type
0 abmf 2008 197 2008-07-15 00:00:00 RA
1 abmf 2008 364 2008-12-29 00:00:00 R
2 abmf 2009 288 2009-10-15 00:00:00 RA
3 abmf 2010 239 2010-08-27 00:00:00 R
4 abmf 2011 83 2011-03-24 00:00:00 R
... ... ... ... ... ...
8374 zwen 1996 32 1996-02-01 00:00:00 R
8375 zwen 1999 251 1999-09-08 00:00:00 R
8376 zwen 2000 251 2000-09-07 00:00:00 R
8377 zwen 2000 265 2000-09-21 00:00:00 A
8378 zwen 2004 266 2004-09-22 00:00:00 A

8379 rows × 5 columns


Compute differences¶

Each pair of differences can be computed separately and independently. Computing a given pair requires that you first run the corresponding cells in the 'Dataframes' section above.
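All four computations below follow the same pattern: select the station identifier, timestamp and value columns, inner-merge the two datasets on (station, timestamp), and subtract. The repetition could be factored into a helper; a possible sketch (not used by the cells below, illustrated here on toy data):

```python
import pandas as pd

def compute_diff(df_a, df_b, value_col="observation_value",
                 suffixes=("_a", "_b")):
    """Inner-merge two datasets on station and timestamp and return
    their value difference (a minus b)."""
    keys = ["primary_station_id", "report_timestamp"]
    cols = keys + [value_col]
    merged = pd.merge(df_a.reset_index()[cols], df_b.reset_index()[cols],
                      on=keys, how="inner", suffixes=suffixes)
    merged["diff"] = (merged[value_col + suffixes[0]]
                      - merged[value_col + suffixes[1]])
    return merged
```

Only timestamps present in both datasets survive the inner merge, which is the behaviour wanted for a paired comparison.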

Compute ΔZTD[IGSDaily - IGSRepro3]¶

In [ ]:
df_repro3 = df_ZTD_igs_repro3.reset_index()[
    [
        "primary_station_id",
        "report_timestamp",
        "observation_value",
        "uncertainty_value1",
    ]
]
df_daily = df_ZTD_igs_daily.reset_index()[
    [
        "primary_station_id",
        "report_timestamp",
        "observation_value",
        "uncertainty_value1",
    ]
]

df_diff_daily_repro3 = pd.merge(
    df_daily,
    df_repro3,
    on=["primary_station_id", "report_timestamp"],
    how="inner",
    suffixes=("_daily", "_repro3"),
)

del df_daily
del df_repro3
In [ ]:
df_diff_daily_repro3["ZTD_diff"] = (
    df_diff_daily_repro3["observation_value_daily"]
    - df_diff_daily_repro3["observation_value_repro3"]
)
df_diff_daily_repro3 = df_diff_daily_repro3.rename(
    columns={
        "primary_station_id": "Station",
        "observation_value_daily": "ZTD_daily",
        "observation_value_repro3": "ZTD_repro3",
        "report_timestamp": "Date",
        "uncertainty_value1_daily": "sigma_ZTD_daily",
        "uncertainty_value1_repro3": "sigma_ZTD_repro3",
    }
)
df_diff_daily_repro3 = df_diff_daily_repro3.set_index("Station")
In [ ]:
df_diff_daily_repro3.to_parquet(
    f"{base_dir}/full_time_series_data/diff_zenith_total_delay_igs_daily_igs_repro3.parquet"
)

Compute ΔIWV [IGSDaily - IGSRepro3]¶

In [ ]:
df_repro3 = df_PWC_igs_repro3.reset_index()[
    [
        "primary_station_id",
        "report_timestamp",
        "observation_value",
        "uncertainty_value5",
    ]
]
df_daily = df_PWC_igs_daily.reset_index()[
    [
        "primary_station_id",
        "report_timestamp",
        "observation_value",
        "uncertainty_value5",
    ]
]

df_diff_daily_repro3 = pd.merge(
    df_daily,
    df_repro3,
    on=["primary_station_id", "report_timestamp"],
    how="inner",
    suffixes=("_daily", "_repro3"),
)
del df_daily
del df_repro3
In [ ]:
df_diff_daily_repro3["PWC_diff"] = (
    df_diff_daily_repro3["observation_value_daily"]
    - df_diff_daily_repro3["observation_value_repro3"]
)
df_diff_daily_repro3 = df_diff_daily_repro3.rename(
    columns={
        "primary_station_id": "Station",
        "observation_value_daily": "PWC_daily",
        "observation_value_repro3": "PWC_repro3",
        "report_timestamp": "Date",
        "uncertainty_value5_daily": "sigma_PWC_daily",
        "uncertainty_value5_repro3": "sigma_PWC_repro3",
    }
)
df_diff_daily_repro3 = df_diff_daily_repro3.set_index("Station")
In [ ]:
df_diff_daily_repro3.to_parquet(
    f"{base_dir}/full_time_series_data/diff_total_column_water_vapour_igs_daily_igs_repro3.parquet"
)

Compute ΔPWC[IGSrepro3 - ERA5]¶

In [14]:
df_repro3 = df_PWC_igs_repro3.reset_index()[
    ["primary_station_id", "report_timestamp", "observation_value"]
]
df_era5 = df_PWC_era5_igs_repro3.reset_index()[
    ["primary_station_id", "report_timestamp", "observation_value"]
]

df_diff_repro3_era5 = pd.merge(
    df_repro3,
    df_era5,
    on=["primary_station_id", "report_timestamp"],
    how="inner",
    suffixes=("_repro3", "_era5"),
)
del df_repro3
del df_era5
In [15]:
df_diff_repro3_era5["PWC_diff"] = (
    df_diff_repro3_era5["observation_value_repro3"]
    - df_diff_repro3_era5["observation_value_era5"]
)
df_diff_repro3_era5 = df_diff_repro3_era5.rename(
    columns={
        "primary_station_id": "Station",
        "observation_value_repro3": "PWC_repro3",
        "observation_value_era5": "PWC_era5",
        "report_timestamp": "Date",
    }
)
df_diff_repro3_era5 = df_diff_repro3_era5.set_index("Station")
In [17]:
df_diff_repro3_era5.to_parquet(
    f"{base_dir}/full_time_series_data/diff_total_column_water_vapour_igs_repro3_era5_{starting_year}_{ending_year}.parquet"
)

Compute ΔPWC[IGSdaily - ERA5]¶

In [22]:
df_daily = df_PWC_igs_daily.reset_index()[
    ["primary_station_id", "report_timestamp", "observation_value"]
]
df_era5 = df_PWC_era5_igs_daily.reset_index()[
    ["primary_station_id", "report_timestamp", "observation_value"]
]

df_diff_daily_era5 = pd.merge(
    df_daily,
    df_era5,
    on=["primary_station_id", "report_timestamp"],
    how="inner",
    suffixes=("_daily", "_era5"),
)
del df_daily
del df_era5
In [23]:
df_diff_daily_era5["PWC_diff"] = (
    df_diff_daily_era5["observation_value_daily"]
    - df_diff_daily_era5["observation_value_era5"]
)
df_diff_daily_era5 = df_diff_daily_era5.rename(
    columns={
        "primary_station_id": "Station",
        "observation_value_daily": "PWC_daily",
        "observation_value_era5": "PWC_era5",
        "report_timestamp": "Date",
    }
)
df_diff_daily_era5 = df_diff_daily_era5.set_index("Station")
In [25]:
df_diff_daily_era5.to_parquet(
    f"{base_dir}/full_time_series_data/diff_total_column_water_vapour_igs_daily_era5_{starting_year}_{ending_year}.parquet"
)

The same kind of differences can be computed for the other datasets (e.g. including EPNrepro2), but we do not do so here.


Visualize the intercomparison of the different datasets¶

Dataframes¶

We do not include the EPNrepro2 dataset in this analysis, but it can be processed in the same way as the IGS datasets.

In [188]:
df_diff_ztd_daily_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/diff_zenith_total_delay_igs_daily_igs_repro3.parquet"
)
df_diff_pwc_daily_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/diff_total_column_water_vapour_igs_daily_igs_repro3.parquet"
)
df_diff_pwc_repro3_era5 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/diff_total_column_water_vapour_igs_repro3_era5_2000_2019.parquet"
)
df_diff_pwc_daily_era5 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/diff_total_column_water_vapour_igs_daily_era5_2000_2023.parquet"
)

Discard stations with fewer than 3 years of data¶
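The cut-off of 26280 observations used below corresponds to three years of data, assuming hourly sampling as in the datasets shown above:

```python
# Three years of hourly observations: 3 years x 365 days x 24 values per day
MIN_OBSERVATIONS = 3 * 365 * 24
print(MIN_OBSERVATIONS)  # 26280
```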

In [ ]:
dfs = {
    "ΔZTD[IGSdaily - IGSrepro3]": df_diff_ztd_daily_repro3,
    "ΔPWC[IGSdaily - ERA5]": df_diff_pwc_daily_era5,
    "ΔPWC[IGSrepro3 - ERA5]": df_diff_pwc_repro3_era5,
    "ΔPWC[IGSdaily - IGSrepro3]": df_diff_pwc_daily_repro3,
}

for name, df in dfs.items():
    counts = df.groupby("Station").size()
    counts_sorted = counts.sort_values(ascending=True)
    low_data_stations = counts_sorted[counts_sorted < 26280].index.tolist()

    print(f"Stations with fewer than 26280 hourly observations (3 years): {len(low_data_stations)}")
    if len(low_data_stations) > 0:
        print(f"Example stations with low data: {low_data_stations[:5]}")

    if "low_data_stations_dict" not in globals():
        globals()["low_data_stations_dict"] = {}

    globals()["low_data_stations_dict"][name] = low_data_stations

    if len(low_data_stations) > 0:
        df = dfs[name]
        df_filtered = df[~df.index.get_level_values("Station").isin(low_data_stations)]
        filtered_df_name = f"df_diff_{name.replace(' - ', '_')}_filtered"
        globals()[filtered_df_name] = df_filtered
        removed_rows = len(df) - len(df_filtered)
        print(
            f"\nFiltered out {removed_rows} rows ({removed_rows/len(df)*100:.2f}%) from {len(low_data_stations)} stations"
        )
        print(
            f"Original dataframe: {len(df)} rows, Filtered dataframe: {len(df_filtered)} rows"
        )

print("Saving filtered dataframes to parquet files...")
pwc_daily_era5_filtered = globals()["df_diff_ΔPWC[IGSdaily_ERA5]_filtered"]
pwc_repro3_era5_filtered = globals()["df_diff_ΔPWC[IGSrepro3_ERA5]_filtered"]
pwc_daily_repro3_filtered = globals()["df_diff_ΔPWC[IGSdaily_IGSrepro3]_filtered"]
ztd_daily_repro3_filtered = globals()["df_diff_ΔZTD[IGSdaily_IGSrepro3]_filtered"]

pwc_daily_era5_filtered.to_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_total_column_water_vapour_igs_daily_era5_2000_2023.parquet"
)
pwc_repro3_era5_filtered.to_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_total_column_water_vapour_igs_repro3_era5_2000_2019.parquet"
)
pwc_daily_repro3_filtered.to_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_total_column_water_vapour_igs_daily_igs_repro3.parquet"
)
ztd_daily_repro3_filtered.to_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_zenith_total_delay_igs_daily_igs_repro3.parquet"
)

print("All filtered dataframes saved successfully.")
Stations with less than 26280 hourly records (~3 years): 113
Example stations with low data: ['FAIV00USA', 'TABV00USA', 'KWJ100MHL', 'SUNM00AUS', 'OBET00DEU']

Filtered out 1489728 rows (2.28%) from 113 stations
Original dataframe: 65283864 rows, Filtered dataframe: 63794136 rows
Stations with less than 26280 hourly records (~3 years): 17
Example stations with low data: ['MTBG00AUT', 'FREE00BHS', 'KWJ100MHL', 'TGCV00CPV', 'MDVO00RUS']

Filtered out 309532 rows (0.68%) from 17 stations
Original dataframe: 45666183 rows, Filtered dataframe: 45356651 rows
Stations with less than 26280 hourly records (~3 years): 68
Example stations with low data: ['MTBG00AUT', 'BOAV00BRA', 'WTZ300DEU', 'KWJ100MHL', 'SUNM00AUS']

Filtered out 934346 rows (2.49%) from 68 stations
Original dataframe: 37479676 rows, Filtered dataframe: 36545330 rows
Stations with less than 26280 hourly records (~3 years): 68
Example stations with low data: ['MTBG00AUT', 'BOAV00BRA', 'WTZ300DEU', 'KWJ100MHL', 'SUNM00AUS']

Filtered out 934346 rows (2.49%) from 68 stations
Original dataframe: 37479676 rows, Filtered dataframe: 36545330 rows
Saving filtered dataframes to parquet files...
All filtered dataframes saved successfully.
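The 26280-record threshold used above corresponds to roughly 3 years of hourly data (3 × 365 × 24). The same per-station count filter can be sketched on a small synthetic example (the station names and counts here are purely illustrative):

```python
import pandas as pd

MIN_RECORDS = 3 * 365 * 24  # 26280 hourly records, about 3 years

# Toy frame: one station with 4 years of hourly data, one with only 1 year
idx = pd.MultiIndex.from_tuples(
    [("LONG00XXX", t) for t in range(4 * 365 * 24)]
    + [("SHRT00XXX", t) for t in range(1 * 365 * 24)],
    names=["Station", "time"],
)
df = pd.DataFrame({"PWC_diff": 0.0}, index=idx)

# Count records per station and drop stations below the threshold
counts = df.groupby("Station").size()
low_data_stations = counts[counts < MIN_RECORDS].index.tolist()
df_filtered = df[~df.index.get_level_values("Station").isin(low_data_stations)]

print(low_data_stations)          # ['SHRT00XXX']
print(len(df), len(df_filtered))  # 43800 35040
```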

Dataframes (only stations with more than 3 years of data)¶

We did not compute any differences for the EPNrepro2 dataset. Therefore, no filtered DataFrames are available for this case.

If you choose not to process differences for a given dataset, define it as an empty DataFrame with a Station index and the appropriate difference column for the variable being compared:

For Zenith Total Delay (ZTD):
In [33]:
pd.DataFrame({'Station': [], 'ZTD_diff': []}).set_index('Station')
Out[33]:
ZTD_diff
Station
For Precipitable Water Content (PWC):
In [32]:
pd.DataFrame({'Station': [], 'PWC_diff': []}).set_index('Station')
Out[32]:
PWC_diff
Station
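These empty placeholders pass safely through the statistics loop further below: a groupby on an empty Station-indexed DataFrame simply yields empty results rather than raising. A quick check (the variable name is illustrative):

```python
import pandas as pd

# Empty placeholder with a Station index, as recommended above
df_empty = pd.DataFrame({"Station": [], "PWC_diff": []}).set_index("Station")

# Grouping an empty frame produces an empty stats table, not an error
stats = df_empty.groupby("Station")["PWC_diff"].agg(["median", "mean", "std"])
print(len(stats))  # 0
```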
In [59]:
df_diff_ztd_daily_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_zenith_total_delay_igs_daily_igs_repro3.parquet"
)
df_diff_ztd_daily_repro2 = pd.DataFrame({'Station': [], 'ZTD_diff': []}).set_index('Station')
df_diff_ztd_repro3_repro2 = pd.DataFrame({'Station': [], 'ZTD_diff': []}).set_index('Station')



df_diff_pwc_daily_repro3 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_total_column_water_vapour_igs_daily_igs_repro3.parquet"
)
df_diff_pwc_daily_repro2 = pd.DataFrame({'Station': [], 'PWC_diff': []}).set_index('Station')
df_diff_pwc_repro3_repro2 = pd.DataFrame({'Station': [], 'PWC_diff': []}).set_index('Station')



df_diff_pwc_repro3_era5 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_total_column_water_vapour_igs_repro3_era5_2000_2019.parquet"
)
df_diff_pwc_daily_era5 = pd.read_parquet(
    f"{base_dir}/full_time_series_data/Over_3_years_diff_total_column_water_vapour_igs_daily_era5_2000_2023.parquet"
)
df_diff_pwc_repro2_era5 = pd.DataFrame({'Station': [], 'PWC_diff': []}).set_index('Station')
In [60]:
dataframes_dict = {
    "ΔZTD[IGSdaily - IGSrepro3]": df_diff_ztd_daily_repro3,
    "ΔZTD[IGSdaily - EPNrepro2]": df_diff_ztd_daily_repro2,
    "ΔZTD[IGSrepro3 - EPNrepro2]": df_diff_ztd_repro3_repro2,

    "ΔPWC[IGSdaily - IGSrepro3]": df_diff_pwc_daily_repro3,
    "ΔPWC[IGSdaily - EPNrepro2]": df_diff_pwc_daily_repro2,
    "ΔPWC[IGSrepro3 - EPNrepro2]": df_diff_pwc_repro3_repro2,

    "ΔPWC[IGSdaily - ERA5]": df_diff_pwc_daily_era5,
    "ΔPWC[IGSrepro3 - ERA5]": df_diff_pwc_repro3_era5,
    "ΔPWC[EPNrepro2 - ERA5]": df_diff_pwc_repro2_era5,
} 

# The dictionary keys are significant and must remain exactly as shown.
# Adapt the names of your dataframes if you want to include intercomparisons with EPNrepro2.

diff_columns = {
    "ΔZTD[IGSdaily - IGSrepro3]": "ZTD_diff",
    "ΔZTD[IGSdaily - EPNrepro2]": "ZTD_diff",
    "ΔZTD[IGSrepro3 - EPNrepro2]": "ZTD_diff",

    "ΔPWC[IGSdaily - IGSrepro3]": "PWC_diff",
    "ΔPWC[IGSdaily - EPNrepro2]": "PWC_diff",
    "ΔPWC[IGSrepro3 - EPNrepro2]": "PWC_diff",

    "ΔPWC[IGSdaily - ERA5]": "PWC_diff",
    "ΔPWC[IGSrepro3 - ERA5]": "PWC_diff",
    "ΔPWC[EPNrepro2 - ERA5]": "PWC_diff",
}
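Since the code below looks up every label in both dictionaries, a quick sanity check that the two key sets match can catch a mistyped label early. A minimal, self-contained sketch (the two toy dictionaries stand in for `dataframes_dict` and `diff_columns`):

```python
# Toy stand-ins for dataframes_dict / diff_columns (values are illustrative)
dataframes_dict = {
    "ΔZTD[IGSdaily - IGSrepro3]": None,
    "ΔPWC[IGSdaily - ERA5]": None,
}
diff_columns = {
    "ΔZTD[IGSdaily - IGSrepro3]": "ZTD_diff",
    "ΔPWC[IGSdaily - ERA5]": "PWC_diff",
}

# Symmetric difference of the key sets: empty means every label is covered in both
missing = set(dataframes_dict) ^ set(diff_columns)
assert not missing, f"Mismatched labels: {missing}"
print("All labels consistent")
```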

General stats¶

In [64]:
station_stats_dict = {}

for label, df in dataframes_dict.items():
    diff_column = diff_columns[label]
    grouped = df.groupby("Station")[diff_column]

    station_stats = grouped.agg(["median", "mean", "std"]).reset_index()
    station_stats["IQR"] = grouped.apply(
        lambda x: x.quantile(0.75) - x.quantile(0.25)
    ).values
    station_stats["abs_median"] = station_stats["median"].abs()
    station_stats["abs_mean"] = station_stats["mean"].abs()
    # For a Gaussian, IQR ≈ 1.35 * sigma, so IQR/1.35 is a robust spread estimate
    station_stats["IQR/1.35"] = station_stats["IQR"] / 1.35
    station_stats_dict[label] = station_stats

station_stats_PWC_REPRO3_ERA5 = station_stats_dict["ΔPWC[IGSrepro3 - ERA5]"]
station_stats_PWC_DAILY_ERA5 = station_stats_dict["ΔPWC[IGSdaily - ERA5]"]
station_stats_ZTD_DAILY_REPRO3 = station_stats_dict["ΔZTD[IGSdaily - IGSrepro3]"]
station_stats_PWC_DAILY_REPRO3 = station_stats_dict["ΔPWC[IGSdaily - IGSrepro3]"]
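The IQR/1.35 column is a robust spread estimate: for a Gaussian distribution the interquartile range is about 1.349 σ, so IQR/1.35 tracks the standard deviation when the differences are near-normal, while being far less sensitive to outliers. A quick numerical check on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=2.0, size=100_000)  # true sigma = 2

iqr = np.quantile(x, 0.75) - np.quantile(x, 0.25)
robust_std = iqr / 1.35

# Both estimates should land close to the true sigma of 2
print(f"std = {x.std():.3f}, IQR/1.35 = {robust_std:.3f}")
```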
In [65]:
station_stats_ZTD_DAILY_REPRO3_mm = station_stats_ZTD_DAILY_REPRO3.copy()
numeric_columns = ["median", "IQR", "mean", "abs_median", "abs_mean", "std", "IQR/1.35"]
station_stats_ZTD_DAILY_REPRO3_mm[numeric_columns] = (
    station_stats_ZTD_DAILY_REPRO3[numeric_columns] * 1000
)

ΔZTD[IGSdaily - IGSrepro3]¶

In [66]:
vf.plot_scatter_joint_stats(
    station_stats_ZTD_DAILY_REPRO3_mm,
    df_equipment,
    dataframes_dict,
    datatype="ZTD",
    x_col="mean",
    y_col="std",
    title="IGSdaily - IGSrepro3",
    xlabel="Mean[ΔZTD] [mm]",
    ylabel="std[ΔZTD] [mm]",
    xlim=(pd.to_datetime("2000-01-01"), pd.to_datetime("2019-12-31")),
    ylim=[-0.1, 0.1],
    annotate_worst=False,
    n_worst=5,
    annotate_threshold_worst=True,
    threshold_mean=(-5, 8),
    threshold_std=13,
    show_grid=True,
    show_stats=True,
    save=False,
    save_path=f"{base_dir}/full_time_series_data/Joint_ZTD_DAILY_REPRO3.png",
    dpi=600,
)
[Output: joint scatter of Mean[ΔZTD] vs std[ΔZTD] per station, with accompanying diagnostic figures]
Out[66]:
<seaborn.axisgrid.JointGrid at 0x72ebad6ea440>

ΔPWC[IGSdaily - IGSrepro3]¶

In [67]:
vf.plot_scatter_joint_stats(
    station_stats_PWC_DAILY_REPRO3,
    df_equipment,
    dataframes_dict,
    x_col="mean",
    y_col="std",
    title="IGSdaily - IGSrepro3",
    xlabel="Mean[ΔPWC] [kg/m²]",
    ylabel="std[ΔPWC] [kg/m²]",
    xlim=(pd.to_datetime("2000-01-01"), pd.to_datetime("2019-12-31")),
    ylim=[-10, 10],
    annotate_worst=False,
    n_worst=5,
    annotate_threshold_worst=True,
    threshold_mean=(-0.5, 1.5),
    threshold_std=2.5,
    show_grid=True,
    show_stats=True,
    save=False,
    save_path=f"{base_dir}/full_time_series_data/Joint_PWC_DAILY_REPRO3.png",
    dpi=600,
)
[Output: joint scatter of Mean[ΔPWC] vs std[ΔPWC] per station, with accompanying diagnostic figures]
Out[67]:
<seaborn.axisgrid.JointGrid at 0x72e9156c6fb0>

ΔPWC[IGSdaily - ERA5]¶

In [68]:
vf.plot_scatter_joint_stats(
    station_stats_PWC_DAILY_ERA5,
    df_equipment,
    dataframes_dict,
    x_col="mean",
    y_col="std",
    title="IGSdaily - ERA5",
    xlabel="Mean[ΔPWC] [kg/m²]",
    ylabel="std[ΔPWC] [kg/m²]",
    xlim=(pd.to_datetime("2000-01-01"), pd.to_datetime("2023-12-31")),
    annotate_worst=False,
    n_worst=5,
    annotate_threshold_worst=True,
    threshold_mean=(-3, 2),
    threshold_std=3.5,
    show_grid=True,
    show_stats=True,
    save=False,
    save_path=f"{base_dir}/full_time_series_data/Joint_PWC_DAILY_ERA5.png",
    dpi=600,
)
[Output: joint scatter of Mean[ΔPWC] vs std[ΔPWC] per station, with accompanying diagnostic figures]
Out[68]:
<seaborn.axisgrid.JointGrid at 0x72ebe223df30>

ΔPWC[IGSrepro3 - ERA5]¶

In [69]:
vf.plot_scatter_joint_stats(
    station_stats_PWC_REPRO3_ERA5,
    df_equipment,
    dataframes_dict,
    x_col="mean",
    y_col="std",
    title="IGSrepro3 - ERA5",
    xlabel="Mean[ΔPWC] [kg/m²]",
    ylabel="std[ΔPWC] [kg/m²]",
    xlim=(pd.to_datetime("2000-01-01"), pd.to_datetime("2023-12-31")),
    annotate_worst=False,
    n_worst=5,
    annotate_threshold_worst=True,
    threshold_mean=(-3, 2),
    threshold_std=3.5,
    show_grid=True,
    show_stats=True,
    save=False,
    save_path=f"{base_dir}/full_time_series_data/Joint_PWC_REPRO3_ERA5.png",
    dpi=600,
)
[Output: joint scatter of Mean[ΔPWC] vs std[ΔPWC] per station, with accompanying diagnostic figures]
Out[69]:
<seaborn.axisgrid.JointGrid at 0x72ebe2d166b0>

Select your station and plot the time series¶

In [70]:
vf.plot_diff_time_series(
    dataframes_dict,
    "MAT100ITA",
    "ZTD_diff",
    labels=["ΔZTD[IGSdaily - IGSrepro3]"],
    equipment_change=df_equipment,
    ylim=None,
    xlim=(pd.to_datetime("2008-01-01"), pd.to_datetime("2020-12-31")),  # ISO format avoids dayfirst ambiguity
    save=False,
)
[Output figure: ΔZTD[IGSdaily - IGSrepro3] time series for station MAT100ITA]
In [71]:
vf.plot_diff_time_series(
    dataframes_dict,
    "BAMF00CAN",
    "PWC_diff",
    labels=["ΔPWC[IGSdaily - ERA5]"],
    equipment_change=df_equipment,
    ylim=None,
    xlim=(pd.to_datetime("2008-01-01"), pd.to_datetime("2023-12-31")),  # ISO format avoids dayfirst ambiguity
    save=False,
)
[Output figure: ΔPWC[IGSdaily - ERA5] time series for station BAMF00CAN]